Sure Independence Screening with NP-dimensionality
Abstract
Ultrahigh dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Correlation screening is a simple and effective method for this task. For generalized linear models, we propose a more general version of independence learning that ranks covariates by their maximum marginal likelihood estimates or by the maximum marginal likelihood itself. We show that the proposed methods possess the sure screening property with a vanishing false selection rate. We quantify explicitly the extent to which the dimensionality can be reduced by independence screening, which depends on the covariance matrix of the covariates and the true parameters. An iterative version of large-scale screening and moderate-scale selection is introduced to handle the difficult situations where independence screening might fail. The effectiveness of the methods is demonstrated by several simulation examples and case studies.
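As a rough illustration of the ranking idea in the abstract, the sketch below screens covariates by the magnitude of their marginal maximum likelihood slope. It assumes a logistic model as the example GLM and standardized covariates; the function names (`marginal_logistic_mle`, `mmle_screen`), the simulated data, and the cutoff `d = 20` are hypothetical choices made for this example and are not taken from the paper. Ranking by the marginal likelihood itself and the iterative variant are not shown.

```python
import numpy as np


def marginal_logistic_mle(x, y, n_iter=25, tol=1e-8):
    """Fit logit P(y=1 | x) = b0 + b1 * x by Newton-Raphson; return [b0, b1]."""
    design = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        eta = np.clip(design @ beta, -30.0, 30.0)   # guard against overflow in exp
        mu = 1.0 / (1.0 + np.exp(-eta))
        weights = mu * (1.0 - mu)
        grad = design.T @ (y - mu)                  # score of the log-likelihood
        hess = design.T @ (design * weights[:, None]) + 1e-8 * np.eye(2)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def mmle_screen(X, y, d):
    """Return indices of the d covariates with the largest |marginal slope|."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)       # put slopes on a common scale
    slopes = np.array(
        [marginal_logistic_mle(Xs[:, j], y)[1] for j in range(Xs.shape[1])]
    )
    keep = np.argsort(-np.abs(slopes))[:d]
    return keep, slopes


# Simulated example (assumption): 3 active covariates out of p = 200.
rng = np.random.default_rng(0)
n, p, d = 400, 200, 20
X = rng.standard_normal((n, p))
eta_true = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

keep, slopes = mmle_screen(X, y, d)
print(sorted(keep.tolist()))
```

Each marginal fit involves only two parameters, so the cost grows linearly in the number of covariates; the retained set can then be passed to a moderate-scale selection method.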
Similar resources
Sure Independence Screening in Generalized Linear Models with NP-Dimensionality
Princeton University and Colorado State University. Ultrahigh dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv (2008) propose an independent screening framework by ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screening...
Sure Independence Screening in Generalized Linear Models with NP-Dimensionality
Ultrahigh-dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849–911] propose an independent screening framework by ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screeni...
Sure Independence Screening for Ultra-High Dimensional Feature Space
High dimensionality is a growing feature in many areas of contemporary statistics. Variable selection is fundamental to high-dimensional statistical modeling. For problems of large or huge scale p_n, computational cost and estimation accuracy are always two top concerns. In a seminal paper, Candes and Tao (2007) propose a minimum ℓ1 estimator, the Dantzig selector, and show that it mimics the id...
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models
A variable screening procedure via correlation learning was proposed in Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening is called NIS...
Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space"
June 30, 2008. Variable selection plays an important role in high dimensional statistical modeling, which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using L1 regularization...